chore(avm)!: Bytecode Retrieval pre-audit docs by MirandaWood · Pull Request #20718 · AztecProtocol/aztec-packages

MirandaWood · 2026-02-20T12:23:43Z

Bytecode retrieval pre-audit PR including:

- New comments/documentation
- Rearranging code
- Renaming variables/relations

This trace delegates heavily to other circuits, so the pre-audit was completed on the assumption that the target circuits constrain as expected. The actual columns used in these lookups have been checked, though.

Closes AVM-54

TODOs/Notes: See comments for those to discuss (otherwise, I need to add ~~preconditions, doc on why we zero some properties on error, and tracegen tests~~ Update: complete!).

barretenberg/cpp/pil/vm2/bytecode/bc_retrieval.pil

barretenberg/cpp/src/barretenberg/vm2/simulation/events/bytecode_events.hpp

barretenberg/cpp/src/barretenberg/vm2/tracegen/bytecode_trace.cpp

barretenberg/cpp/pil/vm2/bytecode/bc_retrieval.pil

jeanmon

Very good work overall!

barretenberg/cpp/pil/vm2/bytecode/bc_retrieval.pil

barretenberg/cpp/src/barretenberg/vm2/simulation/events/bytecode_events.hpp

barretenberg/cpp/src/barretenberg/vm2/tracegen/bytecode_trace.test.cpp

jeanmon · 2026-02-26T10:04:10Z

barretenberg/cpp/src/barretenberg/vm2/simulation/gadgets/bytecode_manager.cpp

    // We convert the bytecode to a shared_ptr because it will be shared by some events.
    auto shared_bytecode = std::make_shared<std::vector<uint8_t>>(std::move(klass.packed_bytecode));
    // Emits BytecodeDecompositionEvent.
    decomposition_events.emit({ .bytecode_id = bytecode_id, .bytecode = shared_bytecode });


The fact that we emit "decomposition events" suggests that the circuit does not mirror simulation one-to-one. bc_retrieval.pil has no interaction with bc_decomposition.pil, IIRC.
Did you reason about completeness at a higher level, across the whole "bc retrieval, fetching, decomposition, hashing" mechanism? It does not look fully modular, so a "local review" might not be sufficient.

So yes, in terms of the 'map' I added to the top:

* execution.pil --> bc_retrieval.pil --> contract_instance_retrieval.pil --> nullifier_check.pil
* --> address_derivation.pil
* --> update_check.pil
* --> retrieved_bytecodes_tree_check.pil
* --> class_id_derivation.pil
* --> instr_fetching.pil --> bc_decomposition.pil <-> bc_hashing.pil
* --> precomputed.pil

The lookup execution.pil --> bc_retrieval.pil occurs on the first row of each context (sel_first_row_in_context) regardless of any errors. The instruction fetching lookup occurs whenever sel_bytecode_retrieval_success is on, which IIRC is a 'subset' of sel_first_row_in_context (i.e. it can only be on at the first row of a context, but not every first row has it on).
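To make the 'subset' claim concrete, here is a minimal sketch that checks it over a toy trace. It assumes the recollection above holds; `ExecutionRow`, `success_implies_first_row` and `count_lookups` are hypothetical names for illustration and do not exist in the codebase.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row layout holding only the two selectors discussed above.
struct ExecutionRow {
    bool sel_first_row_in_context;
    bool sel_bytecode_retrieval_success;
};

// True if sel_bytecode_retrieval_success is only ever on where
// sel_first_row_in_context is also on (the 'subset' relation).
bool success_implies_first_row(const std::vector<ExecutionRow>& trace)
{
    for (const auto& row : trace) {
        if (row.sel_bytecode_retrieval_success && !row.sel_first_row_in_context) {
            return false;
        }
    }
    return true;
}

// Hypothetical tally of how many rows would trigger each lookup.
struct LookupCounts {
    std::size_t retrieval = 0;      // execution.pil --> bc_retrieval.pil
    std::size_t instr_fetching = 0; // gated by sel_bytecode_retrieval_success
};

LookupCounts count_lookups(const std::vector<ExecutionRow>& trace)
{
    LookupCounts counts;
    for (const auto& row : trace) {
        if (row.sel_first_row_in_context) {
            counts.retrieval++;
        }
        if (row.sel_bytecode_retrieval_success) {
            counts.instr_fetching++;
        }
    }
    return counts;
}
```

Under this model, a trace that satisfies `success_implies_first_row` can never have more success-gated lookups than retrieval lookups, and the two counts differ exactly by the number of contexts whose retrieval failed.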
I think that the decomposition event emission in simulation maps 1:1 to the circuit behaviour since we emit it when:

- we have already emitted a retrieval event (i.e. we can't have a decomp event without a retrieval event)
- there is no error; otherwise we return early (sel_bytecode_retrieval_success == 0 in this case)
- we have not already accessed this bytecode; otherwise is_new_class == 0 and a decomp event will already have been emitted
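The emission conditions above can be sketched as a simplified control flow. This is a toy model, not the real bytecode_manager.cpp: `ToyBytecodeManager`, its event structs and the `std::optional` error model are assumptions made for illustration. It reads the last condition as "a decomp event is emitted only the first time a bytecode is accessed", and it follows the comment in using is_new_class as the "not accessed before" flag.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the simulation types.
using BytecodeId = std::uint64_t;

struct RetrievalEvent {
    BytecodeId bytecode_id;
    bool error;
    bool is_new_class;
};

struct DecompositionEvent {
    BytecodeId bytecode_id;
    std::shared_ptr<std::vector<std::uint8_t>> bytecode;
};

struct ToyBytecodeManager {
    std::vector<RetrievalEvent> retrieval_events;
    std::vector<DecompositionEvent> decomposition_events;
    std::unordered_set<BytecodeId> seen;

    // `bytecode` is std::nullopt when retrieval fails (models any retrieval error).
    void get_bytecode(BytecodeId id, std::optional<std::vector<std::uint8_t>> bytecode)
    {
        if (!bytecode.has_value()) {
            // Error: emit a retrieval event only, then return early.
            retrieval_events.push_back({ id, /*error=*/true, /*is_new_class=*/false });
            return;
        }
        const bool is_new = seen.insert(id).second;
        retrieval_events.push_back({ id, /*error=*/false, is_new });
        if (!is_new) {
            // Already accessed: the decomp event was emitted the first time.
            return;
        }
        // A decomp event never exists without a retrieval event.
        assert(!retrieval_events.empty());
        decomposition_events.push_back(
            { id, std::make_shared<std::vector<std::uint8_t>>(std::move(*bytecode)) });
    }
};
```

In this model every call emits exactly one retrieval event, while a decomposition event is emitted at most once per bytecode id and never on the error path.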

I do think this is hard to reason about at the instruction fetching level: in the circuit we link instr_fetching.pil --> bc_decomposition.pil, but in simulation we 'link' bytecode_manager (=> bc_retrieval, bc_decomposition, bc_hashing) to the decomp event. It is the higher-level 'link' in simulation's execute that ties instruction fetching and bytecode management together, so it works in a very roundabout way.

Maybe in the future we should consider emitting the bytecode decomp event from the instruction fetching call instead, to make this easier to reason about?

barretenberg/cpp/src/barretenberg/vm2/tracegen/bytecode_trace.cpp

… pass preaudit WIP

jeanmon

Great! I have only minor suggestions.

barretenberg/cpp/src/barretenberg/vm2/simulation/events/bytecode_events.hpp

barretenberg/cpp/src/barretenberg/vm2/tracegen/bytecode_trace.cpp